SegGen: A Genetic Algorithm for Linear Text Segmentation
نویسندگان
چکیده
This paper describes SegGen, a new algorithm for linear text segmentation on general corpuses. It aims to segment texts into thematic homogeneous parts. Several existing methods have been used for this purpose, based on a sequential creation of boundaries. Here, we propose to consider boundaries simultaneously thanks to a genetic algorithm. SegGen uses two criteria: maximization of the internal cohesion of the formed segments and minimization of the similarity of the adjacent segments. First experimental results are promising and SegGen appears to be very competitive compared with existing methods.
منابع مشابه
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
متن کاملA Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
متن کاملImproving Brain Magnetic Resonance Image (MRI) Segmentation via a Novel Algorithm based on Genetic and Regional Growth
Background:Â Regarding the importance of right diagnosis in medical applications, various methods have been exploited for processing medical images solar. The method of segmentation is used to analyze anal to miscall structures in medical imaging.Objective:Â This study describes a new method for brain Magnetic Resonance Image (MRI) segmentation via a novel algorithm based on genetic and regiona...
متن کاملIdentification of High Crash Road Segment using Genetic Algorithm and Dynamic Segmentation
This paper presents an evolutionary algorithm for recognizing high and low crash road segments using Genetic Algorithm as a dynamic segmentation method. Social and economic costs as well as physical and mental injuries make the governments perceiving to road safety indexes in order to diminish the consequences of road accidents. Due to the limitation of budget for safety...
متن کاملUsing Multiple Discriminant Analysis Approach for Linear Text Segmentation
Research on linear text segmentation has been an on-going focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarization, information retrieval and text understanding. However, for linear text segmentation, there are two critical problems involving automatic boundary detection and automatic determination of the number of segments in ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2007